EXTRACTING SPATIAL INFORMATION FROM NETWORKS WITH LOW-ORDER EIGENVECTORS

MIHAI CUCURINGU, VINCENT D. BLONDEL AND PAUL VAN DOOREN

Abstract. We consider the problem of inferring meaningful spatial information in networks from incomplete 
information on the connection intensity between the nodes of the network. We consider two spatially distributed 
networks: a population migration flow network within the US, and a network of mobile phone calls between cities 
in Belgium. For both networks we use the eigenvectors of the Laplacian matrix constructed from the link intensities 
to obtain informative visualizations and capture natural geographical subdivisions. We observe that some low order 
eigenvectors localize very well and seem to reveal small geographically cohesive regions that match remarkably well 
with political and administrative boundaries. We discuss possible explanations for this observation by describing 
diffusion maps and localized eigenfunctions. In addition, we discuss a possible connection with the weighted graph 
cut problem, and provide numerical evidence supporting the idea that lower order eigenvectors point out local cuts 
in the network. However, we do not provide a formal and rigorous justification for our observations.

1. Introduction. Extensive research over the last decades has greatly increased our understanding
of the topology and the spatial distribution of many social, biological and tech-
nological networks. This paper considers the problem of inferring meaningful spatial and 
structural information from incomplete data sets of pairwise interactions between nodes in a 
network. 

The way people interact in many aspects of everyday life often reflects geopolitical boundaries
surprisingly well. This inhomogeneity of connections in networks leads to natural
divisions, and identifying such divisions can provide valuable insight into how interactions 
in a network are influenced by its topology. The problem of finding the so-called network 
communities, i.e., groups of tightly connected nodes, has been extensively studied in recent 
years and many community detection algorithms exist with different levels of success [10]. 
In this paper, we consider two particular networks: a county-to-county migration network 
constructed from 1995-2000 US Census data, and a city-to-city communication network built 
from mobile phone data over a six-month period in Belgium. Communities in these networks
emerge naturally and are revealed, often at different scales, by the eigenvectors of a
normalized matrix constructed from the weighted adjacency matrix of the network. We dis- 
cuss possible explanations for this observation by describing diffusion maps and localized 
eigenfunctions. 

In the remaining part of this introduction we report on some related contributions that 
deal with communities in networks and the spectra of matrices. However, in none of these
contributions were we able to find an explanation of why low order eigenvectors localize so
well and seem to identify meaningful geographical boundaries. 

One example of a recent study that is related to our work both in terms of the technique 
and end goal, is a paper by Ratti et al. [23]. Starting from measures of the communication 
intensities between counties in the UK, the authors propose a spectral modularity optimization
algorithm that partitions the country into small non-overlapping geographically cohesive
regions that correspond remarkably well with administrative regions.

In [27], the authors Shi and Malik develop a spectral-based algorithm that solves the 
perceptual grouping problem in computer vision by treating the task of image segmentation 
as a graph partitioning problem. Their approach is to segment the graph by introducing a 
new global criterion called normalized cut, that measures not just the dissimilarity between 
different groups but also the total similarity within the groups themselves. They successfully 
extract global impressions of a scene and provide a hierarchical description of it. 

In another recent paper [24], the authors connect mobile data from Telecom Italia Mobile 
to a series of human activities derived from data on commercial premises advertised through 
the Italian version of "Yellow Pages". The eigendecomposition of a specific correlation ma- 
trix provides a top eigenvector which clearly indicates a common underlying pattern to mobile 
phone usage in Rome, while the second and third eigenvectors indicate spatial variation that 
is very suggestive of temporally-related and activity-related patterns. 

Another line of work where lower order eigenvectors provide useful information comes 
from the community detection literature. Newman [20] shows that the modularity of a net- 
work can be expressed in terms of the top eigenvalues and eigenvectors of a matrix called 
the modularity matrix, which plays a role in the maximization of the modularity equivalent 
to that played by the Laplacian in standard spectral partitioning. In related work, Richardson 
et al. [25] extend previously available methods for spectral optimization of modularity by in- 
troducing a computationally efficient algorithm for spectral tripartitioning of a network using 
the top two eigenvectors of the modularity matrix. 

Recent work [9], co-authored by one of the authors of this paper, investigates the con- 
straints imposed by space on the network topology, and focuses on community detection by 
proposing a modularity function adapted to spatial networks. The proposed methods were 
tested on a large mobile phone network and computer-generated benchmarks, and showed 
that it is possible to factor out the effect of space in order to reveal more clearly any hidden 
structural similarities between the nodes. 

Finally, we point out a recent paper of Onnela et al. [21] who investigate social networks 
of individuals whose most frequent geographic locations are known. The authors classify 
the members into groups using community detection algorithms, and explore the relationship 
between their topological and geographic positions. 

This paper is organized as follows: Section 2 is an introduction to the diffusion map 
technique and some of its underlying theory. Section 3 contains the results of numerical sim- 
ulations in which we applied diffusion maps and eigenvector colorings to the US migration 
data set. In Section 4 we present the outcome of similar experiments on the Belgium mobile 
phone data set. In Section 5, we explore the connection with localized eigenfunctions, a phe- 
nomenon observed before in the mathematics and physics community. Finally, the last section 
is a summary and a discussion of possible extensions of our approach and its usefulness in 
other applications. 

2. Diffusion Maps and Eigenvector Colorings. This section is a brief introduction to 
the diffusion maps literature and references therein. We also clarify the notion of eigen- 
vector localizations and eigenvector coloring, that we use in subsequent sections. Diffusion 
maps were introduced in S. Lafon's Ph.D. Thesis [14] in 2004 as a dimensionality reduction 
tool, and connected data analysis and clustering techniques based on eigenvectors of similar- 
ity matrices with the geometric structure of non-linear manifolds. In recent years, diffusion 
maps have gained a lot of popularity. A nonexhaustive list of references to its underlying 
theory and applications includes [1, 6, 7, 8, 14]. Often called Laplacian eigenmaps, these 
manifold learning techniques identify significant variables that live in a lower dimensional 



Extracting spatial information from networks with low-order eigenvectors 






space, while preserving the local proximity between data points. Consider a set of $N$ points
$V = \{x_1, x_2, \ldots, x_N\}$ in an $n$-dimensional space $\mathbb{R}^n$, where each point
(typically) characterizes an image (or an audio stream, text string, etc.). If two images $x_i$
and $x_j$ are similar, then $\|x_i - x_j\|$ is small. A popular measure of similarity between
points in $\mathbb{R}^n$ is defined using the Gaussian kernel
$w_{ij} = e^{-\|x_i - x_j\|^2/\epsilon}$ for some constant $\epsilon$, so that the closer $x_i$ is
to $x_j$, the larger $w_{ij}$. The matrix $W = (w_{ij})_{1 \le i,j \le N}$ is symmetric and has
positive coefficients. To normalize $W$, we define the diagonal matrix $D$, with
$D_{ii} = \sum_{j=1}^{N} w_{ij}$, and define $A$ by

$$A = D^{-1} W$$

such that every row of $A$ sums to 1.
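The construction of $W$, $D$ and $A$ can be sketched in a few lines of NumPy. This is only an illustrative sketch: the function name `random_walk_matrix`, the toy point cloud and the value of `eps` are assumptions made for the example and do not come from the paper.

```python
import numpy as np

def random_walk_matrix(X, eps):
    """Return (W, d, A) for points X (N x n) using a Gaussian kernel of width eps.

    Hypothetical helper: W is the kernel matrix, d its row sums (the diagonal
    of D), and A = D^{-1} W the row-stochastic random-walk matrix.
    """
    # Pairwise squared Euclidean distances ||x_i - x_j||^2.
    diff = X[:, None, :] - X[None, :, :]
    sq_dist = np.sum(diff ** 2, axis=-1)
    W = np.exp(-sq_dist / eps)   # symmetric, strictly positive entries
    d = W.sum(axis=1)            # D_ii = sum_j w_ij
    A = W / d[:, None]           # A = D^{-1} W, each row sums to 1
    return W, d, A

# Toy data (not from the paper): six random points in the plane.
rng = np.random.default_rng(0)
X = rng.normal(size=(6, 2))
W, d, A = random_walk_matrix(X, eps=1.0)
```

Dividing each row of `W` by its row sum is the same operation as left-multiplying by $D^{-1}$, which avoids forming the diagonal matrix explicitly.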

Next, one may also define the symmetric matrix $S = D^{-1/2} W D^{-1/2}$, which can also
be written as $S = D^{1/2} A D^{-1/2}$ and hence is similar to $A$. As a symmetric matrix, $S$
has an orthogonal basis of eigenvectors $v_0, v_1, \ldots, v_{N-1}$ associated to the $N$ real
ordered eigenvalues $1 = \lambda_0 \ge \lambda_1 \ge \ldots \ge \lambda_{N-1}$. If we decompose $S$
as $S = V \Lambda V^T$ with $V V^T = V^T V = I$ and
$\Lambda = \mathrm{diag}(\lambda_0, \lambda_1, \ldots, \lambda_{N-1})$, then $A$ becomes
$A = \Psi \Lambda \Phi^T$ where $\Psi = D^{-1/2} V$ and $\Phi = D^{1/2} V$. Therefore
$A \Psi = \Psi \Lambda$ and the columns of $\Psi$ form a $D$-orthogonal basis of eigenvectors
$\psi_0, \psi_1, \ldots, \psi_{N-1}$ (i.e. $\psi_i^T D \psi_j = 0$, $\forall i \neq j$) associated
to the $N$ real eigenvalues $\lambda_0, \lambda_1, \ldots, \lambda_{N-1}$ such that
$A \psi_i = \lambda_i \psi_i$, for $i = 0, 1, \ldots, N-1$. Also, $\Phi^T A = \Lambda \Phi^T$
implies that the columns of $\Phi$ are left eigenvectors of $A$, which we denote by
$\phi_0, \phi_1, \ldots, \phi_{N-1}$. Since $\Phi^T \Psi = I$, it follows that the vectors $\phi_i$
and $\psi_j$ are bi-orthonormal, $\langle \phi_i, \psi_j \rangle = \delta_{i,j}$.
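The relations between $S$, $V$, $\Psi$ and $\Phi$ can be checked numerically. The sketch below assumes a small random symmetric weight matrix as toy input; the helper name `spectral_decomposition` is hypothetical. It computes the eigenvectors of $A$ through the symmetric matrix $S$, which is the route described above.

```python
import numpy as np

def spectral_decomposition(W):
    """Eigenvalues and right/left eigenvectors of A = D^{-1} W, computed via S.

    Hypothetical helper; returns lam (descending), Psi = D^{-1/2} V and
    Phi = D^{1/2} V, where S = D^{-1/2} W D^{-1/2} = V diag(lam) V^T.
    """
    d = W.sum(axis=1)
    d_sqrt = np.sqrt(d)
    S = W / np.outer(d_sqrt, d_sqrt)     # S = D^{-1/2} W D^{-1/2}
    lam, V = np.linalg.eigh(S)           # eigh returns ascending eigenvalues
    order = np.argsort(lam)[::-1]        # reorder so that lam[0] = 1 comes first
    lam, V = lam[order], V[:, order]
    Psi = V / d_sqrt[:, None]            # right eigenvectors of A
    Phi = V * d_sqrt[:, None]            # left eigenvectors of A
    return lam, Psi, Phi

# Toy symmetric positive weights (not from the paper).
rng = np.random.default_rng(1)
B = rng.random((5, 5))
W = B + B.T
lam, Psi, Phi = spectral_decomposition(W)
A = W / W.sum(axis=1, keepdims=True)
```

Working with the symmetric matrix $S$ lets one use a symmetric eigensolver, which returns real eigenvalues and orthonormal eigenvectors; the eigenvectors of the non-symmetric $A$ then follow by rescaling with $D^{\pm 1/2}$.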

Note that since $A$ is a row-stochastic matrix, $\lambda_0 = 1$ and
$\psi_0 = (1, 1, \ldots, 1)^T$, and we disregard this trivial eigenvalue/eigenvector pair as
irrelevant. Using the stochasticity of $A$, we can interpret it as a random walk matrix on a
weighted graph $G = (V, E, W)$, where the set of nodes consists of the points $x_i$, and there is
an edge between nodes $i$ and $j$ if and only if $w_{ij} > 0$. Taking this perspective, $A_{ij}$
denotes the transition probability from point $x_i$ to $x_j$ in one time step $\Delta t = \epsilon$:

$$\Pr\{x(t + \epsilon) = x_j \mid x(t) = x_i\} = A_{ij}.$$

The parameter $\epsilon$ can now be interpreted in two ways. On the one hand, it is the squared
radius of the neighborhood used to infer local geometric and density information; in particular,
$w_{ij}$ is $O(1)$ when $x_i$ and $x_j$ are in a ball of radius $\sqrt{\epsilon}$, but it is
exponentially small for points that are more than $\sqrt{\epsilon}$ apart. On the other hand,
$\epsilon$ represents the discrete time step at which the random walk jumps from one point to
another.

Interpreting the eigenvectors as functions over our data set, the diffusion map (also called
Laplacian eigenmap), which maps points from the original space to the first $k$ eigenvectors,
$C_t : V \to \mathbb{R}^k$, is defined as

(2.1) $C_t(x_j) = \left( \lambda_1^t \psi_1(j), \lambda_2^t \psi_2(j), \ldots, \lambda_k^t \psi_k(j) \right)$

where the meaning of the exponent $t$ will be made clear in what follows.

Using the left and right eigenvectors denoted earlier, we now write the entries of $A$ as
$A_{ij} = \sum_{r=0}^{N-1} \lambda_r \psi_r(i) \phi_r(j)$, and note that
$A^t_{ij} = \sum_{r=0}^{N-1} \lambda_r^t \psi_r(i) \phi_r(j)$. However, recall that
the probability distribution of a random walk landing at location $x_j$ after exactly $t$ steps,
given that it starts at point $x_i$, is precisely given by the expression
$A^t_{ij} = \Pr\{x(t) = x_j \mid x(0) = x_i\}$.
Given the random walk interpretation, it is natural to quantify the similarity between two
points according to the evolution of their probability distributions

$$D_t^2(i,j) = \sum_{k=1}^{N} \left( A^t_{ik} - A^t_{jk} \right)^2 \frac{1}{d_k},$$

where the weight $\frac{1}{d_k}$, with $d_k = D_{kk}$ the degree of node $k$, takes into account
the empirical local density of the points by giving larger weight to the vertices of lower degree.
Since $D_t(i,j)$ naturally depends on the random walk on the graph, it is called the diffusion
distance at time $t$. In the diffusion map introduced above, it is a matter of choice to tune the
parameter $t$ corresponding to the number of time steps of the random walk. Note that we used
$t = 1$ in the diffusion map embeddings throughout our simulations, and that using different
values of $t$ corresponds to rescaling the axes. The Euclidean distance between two points in the
diffusion map space introduced in (2.1) is given by

(2.2) $\|C_t(x_i) - C_t(x_j)\|^2 = \sum_{r=1}^{N-1} \left( \lambda_r^t \psi_r(i) - \lambda_r^t \psi_r(j) \right)^2.$

Notice that the first eigenvalue $\lambda_0$ does not enter this expression, since it cancels out.
Moreover, as shown in [19], the expression (2.2) equals the diffusion distance $D_t^2(i,j)$ when
$k = N - 1$, i.e., when all $N - 1$ eigenvectors are considered. For ease of visualization, we
used the top $k = 2$ eigenvectors for the projections shown in Figures 3.1, 3.2 and 4.1.
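The equality between the embedding distance (2.2) and the diffusion distance can be verified on a toy graph. The sketch below assumes the standard weighted form of the diffusion distance, $\sum_k (A^t_{ik} - A^t_{jk})^2 / d_k$ with $d_k$ the degree of node $k$, and the normalization $\Psi = D^{-1/2} V$ with orthonormal $V$; the function names and the random weight matrix are hypothetical.

```python
import numpy as np

def diffusion_map(W, k, t=1):
    """Rows are C_t(x_j) = (lam_1^t psi_1(j), ..., lam_k^t psi_k(j))."""
    d_sqrt = np.sqrt(W.sum(axis=1))
    S = W / np.outer(d_sqrt, d_sqrt)
    lam, V = np.linalg.eigh(S)
    order = np.argsort(lam)[::-1]
    lam, V = lam[order], V[:, order]
    Psi = V / d_sqrt[:, None]
    # Skip the trivial pair (lam_0, psi_0) and keep the next k coordinates.
    return (lam[1:k + 1] ** t) * Psi[:, 1:k + 1]

def diffusion_distance_sq(W, i, j, t=1):
    """Squared diffusion distance sum_k (A^t_ik - A^t_jk)^2 / d_k."""
    d = W.sum(axis=1)
    A = W / d[:, None]
    At = np.linalg.matrix_power(A, t)
    return np.sum((At[i] - At[j]) ** 2 / d)

# Toy symmetric positive weights (not from the paper).
rng = np.random.default_rng(2)
B = rng.random((6, 6))
W = B + B.T
N = W.shape[0]

emb = diffusion_map(W, k=N - 1, t=2)      # use all N - 1 nontrivial eigenvectors
lhs = np.sum((emb[0] - emb[3]) ** 2)      # Euclidean distance in the embedding
rhs = diffusion_distance_sq(W, 0, 3, t=2) # diffusion distance from the random walk
```

With fewer than $N - 1$ coordinates, the embedding distance is a truncation of the sum and therefore a lower bound on the diffusion distance.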

Finally, we denote by $C_i$ the coloring of the $N$ data points given by the eigenvector $\psi_i$,
where the color of point $x_k \in V$ is given by the $k$-th entry in $\psi_i$, i.e.

$$C_i(x_k) = \psi_i(k), \quad \text{for all } i = 1, \ldots, N \text{ and } k = 0, \ldots, N - 1.$$

We refer to $C_i$ as an eigenvector coloring of order $i$. The top left plot in Figure 3.4 shows
the eigenvector coloring of order $k = 1$, together with the associated colorbar. In practice, only
the first $k$ eigenvectors are used in the diffusion map introduced in (2.1), with $k \ll N - 1$
chosen such that $\lambda_1^t > \cdots > \lambda_k^t > \delta$ but $\lambda_{k+1}^t < \delta$,
where $\delta$ is a chosen tolerance. Typically, only the top few eigenvectors of $A$ are expected
to contain meaningful information, but as illustrated by the eigenvector colorings shown in this
paper, one can extract relevant information from eigenvectors of much lower order. The phenomenon
of eigenvector localization occurs when most of the components of an eigenvector are zero or close
to zero, and almost all the mass is localized on a relatively small subset of nodes. On the
contrary, delocalized eigenvectors have most of their components small and of roughly the same
magnitude. Furthermore, note there is no issue with the fact that the eigenvectors are defined up
to a scalar. Since each of them is normalized and real, the only remaining ambiguity is the sign;
flipping the sign merely reverses the color map used, and does not change the localization
phenomenon.
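One way to make the distinction between localized and delocalized eigenvectors quantitative is sketched below. The paper itself relies on colorings and histograms; the inverse participation ratio used here is a swapped-in, commonly used measure and is an assumption of this example, as are the function names and the toy vectors.

```python
import numpy as np

def inverse_participation_ratio(v):
    """Sum of p_i^2 with p_i = v_i^2 / ||v||^2 (not a measure used in the paper).

    Equals 1/N for a perfectly flat vector and 1 for a vector supported
    on a single node; it is invariant to the sign and scale of v.
    """
    v = np.asarray(v, dtype=float)
    p = v ** 2 / np.sum(v ** 2)
    return float(np.sum(p ** 2))

def mass_on_subset(v, subset):
    """Fraction of the squared mass of v carried by the nodes in `subset`."""
    v = np.asarray(v, dtype=float)
    return float(np.sum(v[subset] ** 2) / np.sum(v ** 2))

N = 100
delocalized = np.ones(N) / np.sqrt(N)   # all components equal in magnitude

localized = np.zeros(N)
localized[:5] = [0.5, -0.5, 0.4, -0.4, 0.3]
localized += 1e-3                       # small but nonzero elsewhere

ipr_flat = inverse_participation_ratio(delocalized)
ipr_loc = inverse_participation_ratio(localized)
share = mass_on_subset(localized, np.arange(5))
```

Because both quantities depend only on squared entries, they are unaffected by the sign ambiguity of the eigenvectors discussed above.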

3. US Census Migration Data. We apply the diffusion map technique to the 2000 US
Census data that report the number of people that migrated from every county to every other
county in the US during the 1995-2000 time frame [5, 22]. We denote by
$M = (M_{ij})_{1 \le i,j \le N}$ the total number of people that migrated between county $i$ and
county $j$ (so $M_{ij} = M_{ji}$), where $N = 3107$ denotes the number of counties in mainland US.
We let $P_i$ denote the population of county $i$. Figures 3.1 and 3.2 show the results of the
diffusion map technique for longitude and latitude colorings when the following kernels are used:
$W^{(1)}_{ij} = \frac{M_{ij}^2}{P_i P_j}$, $W^{(2)}_{ij} = \frac{M_{ij}}{P_i + P_j}$, and
$W^{(3)}_{ij} = 5500 \frac{M_{ij}}{P_i P_j}$. The diffusion maps resulting from these kernels place
the Midwest closer to the west coast (Figure 3.1), but further from the east coast. Similarly,
the colorings based on latitude reveal the north-south separation. Note that kernel $W^{(1)}$ does
a better job at separating the east and west coasts, Figure 3.1 (b), while kernel $W^{(2)}$
highlights best the separation between north and south, as shown in Figure 3.2 (c). Figure 3.3
shows the histogram of the top 500 eigenvalues of the diffusion matrix $A$, when different kernels
are used.
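As a minimal sketch of how such kernels could be assembled from a symmetric migration count matrix and a population vector, consider the following. The formulas follow the kernel definitions given in this section and in the panel labels of Figure 3.1 (squared flow over the product of populations, flow over the sum of populations, and scaled flow over the product of populations); the function name and the three-county toy data are hypothetical.

```python
import numpy as np

def migration_kernels(M, P):
    """Build three similarity kernels from migration counts M and populations P."""
    M = np.asarray(M, dtype=float)
    P = np.asarray(P, dtype=float)
    prod = np.outer(P, P)                 # P_i * P_j
    total = P[:, None] + P[None, :]       # P_i + P_j
    W1 = M ** 2 / prod
    W2 = M / total
    W3 = 5500.0 * M / prod
    return W1, W2, W3

def row_normalize(W):
    """A = D^{-1} W; assumes every row of W has a positive sum."""
    return W / W.sum(axis=1, keepdims=True)

# Toy data for three counties (not from the Census data set).
M = np.array([[0, 120, 30],
              [120, 0, 60],
              [30, 60, 0]])
P = np.array([1000, 2000, 500])
W1, W2, W3 = migration_kernels(M, P)
```

Note that a constant factor such as 5500 cancels in the normalization $A = D^{-1} W$, so it does not change the eigenvectors of $A$.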

Our kernel of choice for the eigenvector colorings in Figures 3.4 and 3.5 was $W^{(1)}$, as it
produced more visually appealing results in terms of state boundary detection. For the same
reason, we omit the numerical simulations where we used exponential weights to compute
the similarity between the nodes. Note also that the spectrum of $A = D^{-1} W^{(1)}$ in the left
of Figure 3.3 is rather different from the other two spectra, with many more large eigenvalues
and without a visible spectral gap. For the rest of this section, we drop the superscript from
matrix $W^{(1)}$ and refer to it as $W$.




Fig. 3.1. Diffusion map reconstructions from the top two eigenvectors, for various kernels, with
nodes colored by longitude. Panels: (a) map of USA, colored by longitude; (b) kernel
$W^{(1)}_{ij} = M_{ij}^2/(P_i P_j)$; (c) kernel $W^{(2)}_{ij} = M_{ij}/(P_i + P_j)$; (d) kernel
$W^{(3)}_{ij} = M_{ij}/(P_i P_j)$.



In Figure 3.6 we plot the histograms of the entries of several eigenvectors of $A$. Note
that the top eigenvector provides a meaningful partitioning that separates the East from the
Midwest, and has its entries spread in the interval $[-0.03, 0.03]$ with few entries of zero
magnitude. On the other hand, the eigenvectors $\phi_{28}$ and $\phi_{83}$ are localized in the
sense that they have their larger entries localized on a specific subregion of the US map
(highlighted in blue or red in the eigenvector colorings), while taking small values in magnitude
on the rest of the domain. We explore in Section 5 the connection with the phenomenon of
"localized eigenfunctions" of the Laplace operator.

Fig. 3.2. Diffusion map reconstructions from the top two eigenvectors, for various kernels, with
nodes colored by latitude.

Fig. 3.3. Histogram of the top 500 eigenvalues of matrix $A$ for different kernels.

We use the rest of this section to provide a possible interpretation of the color coded
regions that stand out in the eigenvector colorings in Figures 3.4 and 3.5. By interpreting
the matrix $W$ as a weighted graph, we explore a possible connection of such geographically
cohesive colored subgraphs with the graph partitioning problem. In general, the graph
partitioning problem seeks to decompose a graph into $K$ disjoint subgraphs (clusters), while
minimizing the sum of the weights of the "cut" edges, i.e., edges with endpoints in different
clusters. Given the number of clusters $K$, the Weighted-Min-Cut problem is an optimization
problem that computes a partition $\mathcal{P}_1, \ldots, \mathcal{P}_K$ of the vertex set, by
minimizing the weights of the cut edges

(3.1) $\text{Weighted Cut}(\mathcal{P}_1, \ldots, \mathcal{P}_K) = \sum_{i=1}^{K} E_W(\mathcal{P}_i, \overline{\mathcal{P}_i}),$

where $E_W(X, Y) = \sum_{i \in X, j \in Y} W_{ij}$ and $\overline{X}$ denotes the complement of
$X$. For an extensive literature survey on spectral clustering algorithms we refer the reader to
[30], and point out the popular spectral relaxation of (3.1) introduced by Shi and Malik [27].
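The objective in (3.1) is straightforward to evaluate for a given partition. The sketch below is only an illustration on a hypothetical six-node graph made of two triangles joined by one weak edge; the function name `weighted_cut` and the weights are assumptions for the example. It evaluates the objective and does not attempt to minimize it.

```python
import numpy as np

def weighted_cut(W, labels):
    """Evaluate (3.1): sum over clusters of the weight leaving each cluster.

    Each cut edge is counted once from each of its two endpoints' clusters,
    exactly as in the sum over i of E_W(P_i, complement of P_i).
    """
    W = np.asarray(W, dtype=float)
    labels = np.asarray(labels)
    total = 0.0
    for c in np.unique(labels):
        inside = labels == c
        total += W[np.ix_(inside, ~inside)].sum()
    return total

# Toy graph: nodes 0-2 and 3-5 form two tight triangles linked by edge (0, 3).
W = np.array([[0, 5, 5, 1, 0, 0],
              [5, 0, 5, 0, 0, 0],
              [5, 5, 0, 0, 0, 0],
              [1, 0, 0, 0, 4, 4],
              [0, 0, 0, 4, 0, 4],
              [0, 0, 0, 4, 4, 0]], dtype=float)

good = weighted_cut(W, [0, 0, 0, 1, 1, 1])   # cuts only the weak edge
bad = weighted_cut(W, [0, 1, 0, 1, 0, 1])    # cuts through both triangles
```

On this toy graph the partition that follows the two triangles has a much smaller cut value than the interleaved one, which is the behavior the minimization in (3.1) rewards.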

When dividing a graph into two smaller subgraphs, one wishes to minimize the sum of
the weights on the edges across two different subgraphs, and simultaneously, maximize the
sum of the weights on the edges within the subgraphs. Alternatively, one tries to maximize the
ratio between the latter quantity and the former, i.e., between the weights of the inside edges
and the weights of the outside edges. To that end, we perform the following experiment,
where we regard the US states as the clusters, and investigate the possibility that the isolated
colored regions that emerge correspond to local cuts in the weighted graph.

Fig. 3.5. Selected colorings by lower order eigenvectors for the similarity matrix
$W_{ij} = M_{ij}^2/(P_i P_j)$.

Fig. 3.6. Histogram of the entries in the eigenvectors $\phi_1$, $\phi_7$, $\phi_{28}$, $\phi_{83}$
of matrix $A = D^{-1} W^{(1)}$. Panels: (a) $\phi_1$; (b) $\phi_7$; (c) $\phi_{28}$; (d) $\phi_{83}$.

We denote by $S$ the matrix of size $N \times N$ ($N = 49$, the number of mainland US states)
that aggregates the similarities between counties at the level of states. In particular, if state
$i$ has $k$ counties with indices $x_1, \ldots, x_k$, and state $j$ has $l$ counties with indices
$y_1, \ldots, y_l$, then we consider the $k \times l$ submatrix

(3.2) $W_{ij} = W_{\{x_1, \ldots, x_k\}, \{y_1, \ldots, y_l\}}$

and denote by $S_{ij}$ the sum of the $kl$ entries in $W_{ij}$. In other words, matrix $S$ is a
"state-collapsed" version of the matrix $W$, and gives a measure of similarity between pairs of
states. The heatmap in Figure 3.7 shows the components of the matrix $S$ on a logarithmic scale,
where the intensity of entry $(i, j)$ denotes the aggregated similarity between states $i$ and $j$.

We refer to the diagonal entry $S_{ii}$ as the "inside degree" of state $i$,
$d_i^{\mathrm{in}} = S_{ii}$, which measures the internal similarity between the counties of state
$i$. We denote by $d_i^{\mathrm{out}} = \sum_{u=1, u \neq i}^{N} S_{iu}$ (i.e., the sum of the
non-diagonal elements in row $i$) the "outside degree" of node $i$, which measures the
similarity/migration between the counties of state $i$ and all other counties outside of state
$i$. Finally, we denote by $d_i^{\mathrm{ratio}} = d_i^{\mathrm{in}} / d_i^{\mathrm{out}}$ the
"ratio degree" of node $i$, which compares intra-state with inter-state migration. A large
ratio degree is a good indication that a state is very well connected internally, and has little
connectivity with the outside world, and thus is a good candidate for a cluster. In other words,
a large "ratio degree" of a cluster (i.e., state) denotes a high measure of separation between
that cluster and its environment, which is something discovered by the localization properties
of the low-order eigenvectors. Table 3 ranks the top 15 states within the US in terms of their
ratio degree.
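The state-collapsed matrix and the three degrees can be computed with an indicator matrix, as in the following sketch. It assumes the ratio degree is the inside degree divided by the outside degree, as the discussion above suggests, and that every state has a nonzero outside degree; the function names, the four-county weight matrix and the state labels are hypothetical.

```python
import numpy as np

def collapse_by_group(W, groups):
    """Sum the entries of W over all pairs of groups (e.g. counties -> states)."""
    W = np.asarray(W, dtype=float)
    groups = np.asarray(groups)
    names = np.unique(groups)
    # Indicator matrix Z: Z[c, s] = 1 if county c belongs to state names[s].
    Z = (groups[:, None] == names[None, :]).astype(float)
    return names, Z.T @ W @ Z

def ratio_degrees(S):
    """Inside, outside and ratio degrees of each row of the collapsed matrix."""
    d_in = np.diag(S).copy()
    d_out = S.sum(axis=1) - d_in     # assumes d_out > 0 for every state
    return d_in, d_out, d_in / d_out

# Toy example: four counties in two states (not from the paper's data).
W = np.array([[0, 6, 1, 0],
              [6, 0, 0, 1],
              [1, 0, 0, 2],
              [0, 1, 2, 0]], dtype=float)
groups = ["A", "A", "B", "B"]

names, S = collapse_by_group(W, groups)
d_in, d_out, d_ratio = ratio_degrees(S)
ranking = names[np.argsort(d_ratio)[::-1]]   # states sorted by ratio degree
```

In this toy example state A is tightly connected internally relative to its outside links, so it ranks first by ratio degree.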

Next, we examine the top several eigenvector colorings in Figure 3.4, and point out the
individual states on which the eigenvectors localize, together with their rank in terms of "ratio
degree". Note that the entries of large magnitude are colored in red and blue, while the rest
of the color spectrum denotes values of smaller magnitude or very close to zero. The top three
eigenvectors correspond to global cuts between various coasts within the US. The only state
that stands out individually is Michigan (MI) for $k = 3$, which has rank 2. For $k = 4$, the
largest entries correspond to counties in Virginia (VA), which also appears in the ranking of
Table 3, and similarly for Wisconsin (WI) for $k = 5$, ranked 14. For $k = 6$, the states colored
in dark red and dark blue are Georgia (GA) with rank 3, and Missouri (MO) of rank 8. When $k = 7$,
Michigan (MI), of rank 2, stands out as the only dark blue colored state. For $k = 8$, we point out
Georgia, rank 3, together with Mississippi (MS) of rank 11, and Louisiana (LA) of rank 10.
Eigenvector $k = 9$ localizes mostly on Maine (ME) of rank 6, and the New York (NY) area
with rank 7. Finally, eigenvector $k = 10$ localizes on a combination of states we already
pointed out. We have thus enumerated nine states that stand out in the top ten eigenvector
colorings, and all nine of them appear in Table 3 that ranks the top fifteen states in terms
of "ratio degree". Although this experiment does not provide a formal justification for the
eigenvector localization phenomenon, we believe it is a first step in providing evidence that
the low order eigenvectors point out local cuts in the network.

Fig. 3.7. Heatmap of the inter-state migration flows, where the rows and columns of the matrix
are sorted by the ratio degrees of the states. The intensity of entry $(i, j)$ denotes, on a
logarithmic scale, the similarity between states $i$ and $j$, i.e., the sum of all entries in the
submatrix $W_{ij}$ defined in (3.2). Table 3 lists the top 15 states in terms of ratio degree.

4. Belgium Mobile Network. In a recent work [13], we studied the anonymized mobile 
phone communication from a Belgian operator and derived a statistical model of interaction 
between cities, showing that inter-city communication intensity is characterized by a gravity
model: the communication intensity between two cities is proportional to the product of their 
sizes divided by the square of their distance. In this section, we briefly describe the Belgium 
mobile data set, summarize the results in [13], and apply the diffusion map technique. We 
refer the reader to [2] for more information on the mobile phone data set. 

The data set contains anonymous communication patterns of 2.5 million mobile phone 
customers, grouped in 571 cities in Belgium over a period of six months in 2006 (see also 
[15] for a description of the data set). Every customer is associated with the ZIP code of 
her/his billing address. Note that calls involving other operators were filtered out, meaning 
that both the calling and receiving individuals in the data set are customers of the mobile 
phone company. Also, there is a link between two customers if at least three calls were made
in both directions during the six-month interval. After this pre-processing, the network has
2.5 million nodes and 38 million links. For every pair of customers we associate a communication
intensity by computing the total communication time in seconds. After grouping
the customers into their corresponding cities, we compute $T_{ij}$, the aggregated communication
time in seconds between the customers of city $i$ and $j$, and denote the resulting matrix
by $T = (T_{ij})_{1 \le i,j \le n}$. We denote by $N_{ij}$ the number of phone calls between cities
$i$ and $j$, by $R_{ij} = \frac{T_{ij}}{N_{ij}}$ the average duration of a call, and by $P_i$ the
number of customers that have the ZIP code billing address of city $i$ (from now on, we refer to
$P_i$ as the population of city $i$). Furthermore, the normalized number of phone calls with
respect to the population of the cities is denoted by $\bar{N}_{ij} = \frac{N_{ij}}{P_i P_j}$,
and similarly the normalized communication time by $\bar{T}_{ij} = \frac{T_{ij}}{P_i P_j}$.
Finally, $D = (d_{ij})_{1 \le i,j \le n}$ represents the distances between the centroids of the
areas of cities $i$ and $j$. Using these quantities, we now consider the following three kernels:
/ ji^o.ie \ 2 

Figure 4.1 shows the diffusion map reconstructions for various matrices $W$ that relate
cities based on their communication intensities and population sizes. For two of the kernels,
there is an obvious separation between the north and south parts of Belgium, which stems
from the fact that the two regions belong to different linguistic groups. The same separation
is emphasized by the coloring associated to the top eigenvector of matrix $A$, shown in Figure
4.2. The remaining eigenvector colorings in Figure 4.2 clearly isolate various subregions in
Belgium. For example, some eigenvectors highlight language communities (French,
Dutch and German), while others isolate the regions of Liege and Limburg.

Fig. 4.1. Panels: (a) map of Belgium, colored by latitude; (b) kernel $W^{(1)}$; (c) kernel
$W^{(2)}$; (d) kernel $W^{(3)}$. $T_{ij}$ denotes the aggregated communication time in seconds,
$N_{ij}$ the number of phone calls between cities $i$ and $j$, $R_{ij} = \frac{T_{ij}}{N_{ij}}$
the average duration of a call, and $P_i$ the population of city $i$. We normalize by the
population size by defining $\bar{N}_{ij} = \frac{N_{ij}}{P_i P_j}$ and
$\bar{T}_{ij} = \frac{T_{ij}}{P_i P_j}$.

5. Localized eigenfunctions. Let us first make more precise what is meant by a localized
eigenfunction. This phenomenon of localization occurs when there exist eigenfunctions
supported by small regions of the domain, i.e. they are localized in these regions. An
eigenfunction localized on a domain $\Omega_1$ has support on $\Omega_1$ significantly larger than
on the complement of $\Omega_1$, and yet it cannot vanish on the complement since eigenfunctions
of isolated eigenvalues are real analytic functions and cannot vanish on any open set. This is
also observed in the histogram of the entries of the eigenvectors $\phi_7$, $\phi_{28}$ and
$\phi_{83}$ shown in Figure 3.6 and the corresponding colorings in Figures 3.4 and 3.5. In
contrast, eigenfunctions that do not localize have their support "uniformly" distributed across
the domain, similar to the case of eigenvector $\phi_1$ from Figure 3.6. For example, in the case
of the unit interval, the eigenfunctions of the Laplacian are the sine or cosine functions
(depending on the boundary conditions) with the larger eigenvalues corresponding to higher
oscillations, and they are not localized in the sense that there is no specific subinterval that
carries the most (potential) energy of the eigenfunction, and any subinterval supports an amount
of energy that is proportional to its length. In other words, the energy of the top eigenfunctions
is distributed uniformly across the domain, and similar results are known to hold for the disk and
the sphere, where the Laplacian eigenvalues and eigenfunctions are explicitly known.
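The unit interval example can be illustrated numerically. The sketch below assumes Dirichlet boundary conditions, so that the eigenfunctions are $\sin(k \pi x)$, and takes the squared amplitude as the energy density; the function name, the grid size and the chosen subinterval are assumptions made for the example. For an oscillatory eigenfunction, the share of energy in a subinterval is close to the length of that subinterval.

```python
import numpy as np

def energy_fraction(k, a, b, n_grid=20001):
    """Share of the energy of sin(k*pi*x) on [0, 1] carried by [a, b].

    The integrals are approximated by sums on a uniform grid.
    """
    x = np.linspace(0.0, 1.0, n_grid)
    f2 = np.sin(k * np.pi * x) ** 2
    inside = (x >= a) & (x <= b)
    return f2[inside].sum() / f2.sum()

# Subinterval [0.2, 0.5] has length 0.3.
frac_high = energy_fraction(k=50, a=0.2, b=0.5)   # oscillatory eigenfunction
frac_low = energy_fraction(k=1, a=0.2, b=0.5)     # lowest eigenfunction
```

For the lowest mode the share deviates from the interval length because a single half-wave concentrates near the middle of the interval; the proportionality described above becomes accurate as the number of oscillations grows.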

The behavior observed in the eigenvector colorings from Figures 3.4, 3.5 and 4.2 is related
to the notion of localized eigenfunctions, a phenomenon observed before in the mathematics
and physics community. The spectrum of the (continuous) Laplace operator has been extensively
studied, and there exists a rich literature on the relationship between the spectrum and
the geometry of the domain. As more complicated objects, eigenfunctions are more difficult
to analyze than the spectrum, and less is known about them. Most of the literature is focused
on high frequency eigenfunctions (associated to larger eigenvalues), such as [3, 4, 12],
although recent studies such as [11] advocated localized eigenfunctions associated to small
eigenvalues. In our experiments, we found the bottom eigenvectors uninteresting as they
did not contain any meaningful geometric information. In his work, Sapoval [26] studied
localized eigenfunctions in different domains and pointed out their importance for physical
applications, such as designing efficient noise-protective walls.

Fig. 4.2. Colorings by the top 18 eigenvectors of $A = D^{-1} W^{(3)}$, where $W^{(3)}_{ij}$ is
normalized by $P_i P_j$.

Finally, considering that A is a stochastic matrix, one may further explore ideas from
the theory of nearly completely decomposable matrices developed in 1961 by Nobel laureate
Herbert Simon and his collaborator Albert Ando to describe and identify the short, medium 
and long-term behaviors of a dynamical system [28]. Very recent work [18] explores this idea 
in the context of stochastic data clustering, and proposes a technique that uses the evolution 
of the system to infer information on the initial structure. 
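A small numerical illustration of near-complete decomposability is sketched below. It is not taken from [28] or [18]: the construction, the function name and the coupling value `eps` are assumptions for the example. A random walk on two tightly connected blocks with weak coupling has two eigenvalues close to 1, one per block, while the remaining eigenvalues are well separated from them.

```python
import numpy as np

def block_random_walk(block_sizes, eps):
    """Row-stochastic matrix with weight 1 inside blocks and eps across blocks."""
    labels = np.repeat(np.arange(len(block_sizes)), block_sizes)
    same_block = labels[:, None] == labels[None, :]
    W = np.where(same_block, 1.0, eps)
    return W / W.sum(axis=1, keepdims=True)

# Two blocks of three nodes each, weakly coupled (toy example).
A = block_random_walk([3, 3], eps=0.01)
eigvals = np.sort(np.linalg.eigvals(A).real)[::-1]
```

In this toy case the second eigenvalue equals $(1 - \epsilon)/(1 + \epsilon)$, which approaches 1 as the coupling vanishes; the slow mixing between blocks is the long-term behavior in the sense of the paragraph above.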

6. Summary and Discussion. We have shown how the diffusion map technique can be 
used to obtain informative visualizations and capture natural subdivisions within two different 
real networks. We find it surprising that some low order eigenvectors localize very well and
seem to reveal small geographically cohesive regions; it is natural to ask for an explanation 
for our observation. 

In looking at figures 3.4 and 3.5 many more questions come to mind. Are the state 
boundaries a consequence of people migrating within the same state or not? In other words, 
do states emerge as communities because of people migrating from one county to the other 
within the state, or because of similar migration patterns directed outside the state? Prelim- 
inary analysis on the migration data set in the context of local clustering on graphs supports 
the idea that the localized low-order eigenvectors highlight local cuts in the network. This is 
perhaps counter-intuitive since such low-order eigenvectors must satisfy the global require- 
ment of exact orthogonality with respect to all of the earlier delocalized eigenvectors, and 
they must do so while keeping most of their components zero or close to zero. Another ques- 
tion to consider is whether, besides the state boundary detection, the eigenvector colorings 
reveal any extra information on the intensity of the migration from one region to the other. 
Furthermore, inter-county migration is most common among young adults and declines as 
people age, and one may ask how the age composition (or income level) of individual US 
counties impacts the migration pattern. 

In answering these questions, one needs to complement the mathematical description 
of diffusion maps and clustering by eigenvectors with a socio-demographic behavioral inter- 
pretation of migration trends, as considered for example in [16, 17]. A more recent paper 
by Slater [29] is of particular interest since it analyzes migration patterns in the US Census 
data from 1965-1970 and 1995-2000. Amongst others, it highlights cosmopolitan or hub like 
regions, as well as isolated regions that emerge when there is a high measure of separation 
between a cluster and its environment. 

Another interesting direction worth exploring is seeing how the diffusion map recon- 
structions and colorings change when the matrices used are no longer symmetric. In the case 
of the US migration data, it may be the case that there are many states for which the most 
common migration destination is the major city/capital of that state (although there might be 
other destinations spread across the US that attract people migrating out from that state). It is 
therefore natural to expect that major cities will stand out in the colorings; however, this is
not the case in our simulations since we symmetrize the migration matrix and take into account
both the in- and out-migration from a given state.